Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make index macros not recreate indexes on incremental models #116

Merged
merged 4 commits into from
Jun 23, 2021

Conversation

infused-kim
Copy link

@infused-kim infused-kim commented Feb 21, 2021

If you create indexes on an incremental model like this:

{{
    config(
        materialized='incremental',
        as_columnstore = false,
        unique_key='my_key',
	    post_hook = [
		    "{{ create_clustered_index(columns = ['my_key'], 
                                                                 unique=True) }}",
		    "{{ create_nonclustered_index(columns = ['id']) }}",
	    ]
    )
}}

You get this error on subsequent runs:

  ('42S11', "[42S11] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The operation failed because an index or statistics with name ‘my_table__clustered_index_on_my’_col already exists on table ‘my-db.dbt_xyz.my_table’. (1913) (SQLExecDirectW)")

This can be avoided by dropping the indexes in a pre-hook, but that leads to the index being recreated from scratch for all rows in the table. On a large incremental model that's probably not ideal in most cases.

This PR adds a check to the macros to only create the index if it doesn't exist yet. This way SQL server only adds the newly merged values to the index and doesn't reprocess all rows.

I don't think this would create any breaking changes for existing users of the macros. If they are already dropping indexes, it would not affect them.

And if they are using indexes on non-incremental models, then it should also not make a difference.

I have been using these macros for several months in production now and haven't had any issues.

This addresses the issue #25 I raised about this.

Fixes errors like the one below if you create indexes on incremental models and don’t drop them on each refresh:

  ('42S11', "[42S11] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The operation failed because an index or statistics with name ‘my_table__clustered_index_on_my’_col already exists on table ‘my-db.dbt_xyz.my_table’. (1913) (SQLExecDirectW)")
@alittlesliceoftom
Copy link

@mikaelene I think this one should be approved, I've been using the macros (just from my local macros folder) and they work really nicely.

Big thanks @infused-kim for raising

@mikaelene
Copy link
Collaborator

Great! @infused-kim , can you update the changeling as well? Adding a row of which issue it solves etc...

@visch
Copy link

visch commented May 13, 2021

Works for me

@infused-kim

Here's an example changelog to add, I can rebase if you'd like?

New Features

  • Index Macros check to see if the index already exists before adding the index to your table

@semcha
Copy link
Contributor

semcha commented Jun 14, 2021

@swanderz Hi! What does it take to release this PR?

@dataders dataders merged commit ab9b6fd into dbt-msft:master Jun 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants